Understanding and Optimizing Heterogeneous Soft-Error Protection

Authors

  • Lukasz G. Szafaryn
  • Mircea Stan
  • James Cohoon
  • Brett Meyer
Abstract

Continued technology scaling in circuits exacerbates the effects of physical phenomena, such as particle strikes [1] and process variation [2], that cause soft errors. While recent advances in fabrication technology decrease the severity of these effects for the next transistor generation [3] [4], the trend of increasing error rates will inevitably continue with further scaling or changes in the operating environment. As a result, it is now becoming necessary to protect a larger range of processor types, including commodity products, in addition to high-end servers [5] and safety-critical systems [6]. Maintaining the same level of resiliency as processor complexity constantly increases requires novel approaches that use a wider range of resiliency techniques for improved coverage and efficiency. While a broad range of existing and proposed protection techniques spans the hardware, architecture, and software layers, only a few have made it into commercial products, and most have only been analyzed individually on inconsistent platforms. Therefore, there is little understanding of the relative trade-offs between available solutions [7], which makes it difficult to arrive at optimal decisions for a resilient design. Moreover, no previous work has attempted to compare the more efficient techniques from the architecture and software layers and to analyze their combinations with hardware-level solutions, which could potentially be better tailored to protection needs. Furthermore, there is little analysis of recovery cost, which is crucial to the efficiency of protection.
To better understand and optimize soft-error protection, we study the following research issues: 1) the design of a framework for analyzing the protection overheads of traditionally used resilience solutions at different core or component granularities for the OpenRISC sample processor; 2) the implementation of data-flow checking, an efficient architecture-level resilience technique, for the Leon and IVM processors, to compare it against low-overhead hardware- and software-level techniques, their complementary combinations, and recovery methods; and 3) the analysis of a hardware algorithm-specific protection technique for an FFT accelerator in comparison with traditionally used hardware solutions, to demonstrate the benefits of the algorithm-specific approach for accelerators.

Similar resources

Static Analysis to Mitigate Soft Error Failures in Processors

By 2011, Integrated Circuit (IC) feature sizes are expected to shrink from the present-day 45 nm to 22 nm, and the soft error rate will increase by 3 to 4 orders of magnitude (from one per year to one per hour). The International Technology Roadmap for Semiconductors (ITRS) indicates that techniques for mitigating soft errors are crucial for future generations of Integrated Circuits. Soft errors ar...

Analyzing Soft Errors in Leakage Optimized SRAM Design

Reducing leakage power and improving the reliability of data stored in memory cells are both becoming more challenging as technology scales down. While smaller threshold voltages cause increased leakage, smaller supply voltages and node capacitances can be a problem for soft errors. This work compares the soft error rates of some recently proposed SRAM leakage optimization approaches. Our r...

Optimizing the warranty period by cuckoo meta-heuristic algorithm in heterogeneous customers’ population

Warranty is now an integral part of each product. Since its length is directly related to the cost of production, it should be set in such a way that it would maximize revenue generation and customers’ satisfaction. Furthermore, based on the behavior of customers, it is assumed that increasing the warranty period to earn the trust of more customers leads to more sales until the market is sat...

Compiler Approach for Reducing Soft Errors in Register Files

With continuous technology scaling, soft errors are becoming an increasingly important design concern even for earthbound applications. While compiler approaches have the potential to mitigate the effect of soft errors with minimal runtime overheads, static vulnerability estimation—an essential part of compiler approaches—is lacking due to its inherent complexity. This paper presents a static a...

Control Flow Checking or Not? (for Soft Errors)

Control Flow Checking (CFC) techniques were proposed to provide efficient protection from soft errors. The main idea is that most soft errors will eventually manifest as errors in the sequence of instruction execution. Therefore, just by making sure that the sequence of instructions executed (or the control flow of the program) is correct, significant protection can be achieved. Note that ...

Publication date: 2015